Gradient Descent, clearly explained in Python, Part 1: The troubling theory.
If you have ever done a Kaggle competition, these would commonly be referred to as evaluation metrics. Typically, the lower the loss, the better the performance of your model. So if, for example, you were predicting house prices and your cost, measured as Root Mean Squared Error (the square root of Mean Squared Error, which puts it back in dollars), was $25,000, that means your model is performing poorly, as its predictions are typically off by about $25,000. Going back to our analogy, imagine that instead of a mountain there is a U-shaped curve, and instead of a person there is the cost function, with an initial cost value of, say, 25,000. The aim of Gradient Descent is to minimise this cost, ideally down to the global minimum (the lowest value the cost can take, which may be 0), or at least to a local minimum where the cost is much smaller than where it started.
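To make the U-shaped curve picture concrete, here is a minimal sketch of gradient descent walking down a toy cost function. The curve `(w - 3) ** 2`, the starting point, the learning rate and the step count are all made-up values chosen for illustration; they are not taken from any real house price model.

```python
def cost(w):
    # Hypothetical U-shaped cost curve whose lowest point (cost 0) is at w = 3
    return (w - 3.0) ** 2


def gradient(w):
    # Slope of the cost curve at w: the derivative of (w - 3)^2
    return 2.0 * (w - 3.0)


def gradient_descent(w_start, learning_rate=0.1, steps=100):
    # Repeatedly step in the direction opposite to the slope
    w = w_start
    history = [cost(w)]  # cost before any update, then after each step
    for _ in range(steps):
        w -= learning_rate * gradient(w)
        history.append(cost(w))
    return w, history


# Start far up one side of the curve (an arbitrary choice)
w_final, history = gradient_descent(w_start=-10.0)

print(f"initial cost: {history[0]:.2f}")
print(f"final cost:   {history[-1]:.6f}")
print(f"final w:      {w_final:.4f}")
```

Because this toy curve has a single lowest point, the local minimum and the global minimum are the same place. On a bumpier cost surface, the same loop could settle in a dip that is lower than where it started but is not the lowest point overall.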